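Automated archiving of design data could reduce the time designers lose from creative and effective work. Although many datasets exist for classification, detection, and instance segmentation of car exteriors, these large datasets are not relevant to design practice because their primary purpose is autonomous driving or vehicle verification. We therefore release GP22, a dataset of car styling features defined by automotive designers. It contains 1480 car side-profile images from 37 brands and ten car segments, together with annotations of design features that follow a taxonomy of car exterior design features as defined from the automotive designer's point of view. We trained a baseline design-feature detection model on the dataset using YOLO v5. The resulting model achieved an mAP score of 0.995 and a recall of 0.984. In addition, exploring the model's performance on sketches and rendered images of car side profiles suggests that the dataset can be extended for design purposes.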
Translated by Google Translate
Human organs constantly undergo anatomical changes due to a complex mix of short-term (e.g., heartbeat) and long-term (e.g., aging) factors. Prior knowledge of these factors is therefore beneficial when modeling an organ's future state, i.e., via image generation. However, most medical image generation tasks rely only on the input from a single image, thus ignoring the sequential dependency even when longitudinal data is available. Sequence-aware deep generative models, where the model input is a sequence of ordered and timestamped images, are still underexplored in the medical imaging domain, which poses several unique challenges: 1) sequences of various lengths; 2) missing data or frames; and 3) high dimensionality. To this end, we propose a sequence-aware diffusion model (SADM) for the generation of longitudinal medical images. Diffusion models have recently shown promising results on high-fidelity image generation. Our method extends this technique by introducing a sequence-aware transformer as the conditional module in a diffusion model. This design enables learning longitudinal dependency even with missing data during training and allows autoregressive generation of a sequence of images during inference. Our extensive experiments on 3D longitudinal medical images demonstrate the effectiveness of SADM compared with baselines and alternative methods.
Word Sense Disambiguation (WSD) is an NLP task aimed at determining the correct sense of a word in a sentence from discrete sense choices. Although current systems have attained unprecedented performance on this task, the nonuniform distribution of word senses during training generally results in systems performing poorly on rare senses. To this end, we consider data augmentation that increases the frequency of these least frequent senses (LFS) and thereby reduces the distributional bias of senses during training. We propose Sense-Maintained Sentence Mixup (SMSMix), a novel word-level mixup method that maintains the sense of a target word. SMSMix smoothly blends two sentences using mask prediction while preserving the relevant span, determined by saliency scores, to maintain a specific word's sense. To the best of our knowledge, this is the first attempt to apply mixup in NLP while preserving the meaning of a specific word. With extensive experiments, we validate that our augmentation method effectively provides more information about rare senses during training while maintaining the target sense label.
Video-grounded Dialogue (VGD) aims to decode an answer sentence to a question regarding a given video and dialogue context. Despite the recent success of multi-modal reasoning in generating answer sentences, existing dialogue systems still suffer from a text hallucination problem: indiscriminate copying of text from the input without an understanding of the question. The cause is a spurious correlation learned from the dataset, where answer sentences usually include words from the input texts, so the VGD system relies excessively on copying input words in the hope that they overlap with the ground-truth texts. Hence, we design the Text Hallucination Mitigating (THAM) framework, which incorporates a Text Hallucination Regularization (THR) loss derived from the proposed information-theoretic text hallucination measurement approach. Applying THAM to current dialogue systems validates its effectiveness on VGD benchmarks (i.e., AVSD@DSTC7 and AVSD@DSTC8) and shows enhanced interpretability.
Computational fluid dynamics (CFD) is a valuable asset for patient-specific cardiovascular-disease diagnosis and prognosis, but its high computational demands hamper its adoption in practice. Machine-learning methods that estimate blood flow in individual patients could accelerate or replace CFD simulation to overcome these limitations. In this work, we consider the estimation of vector-valued quantities on the wall of three-dimensional geometric artery models. We employ group-equivariant graph convolution in an end-to-end SE(3)-equivariant neural network that operates directly on triangular surface meshes and makes efficient use of training data. We run experiments on a large dataset of synthetic coronary arteries and find that our method estimates directional wall shear stress (WSS) with an approximation error of 7.6% and a normalised mean absolute error (NMAE) of 0.4% while being up to two orders of magnitude faster than CFD. Furthermore, we show that our method is powerful enough to accurately predict transient, vector-valued WSS over the cardiac cycle while conditioned on a range of different inflow boundary conditions. These results demonstrate the potential of our proposed method as a plug-in replacement for CFD in the personalised prediction of hemodynamic vector and scalar fields.
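To make the SE(3)-equivariance property mentioned above concrete, the following is a minimal sketch that checks it numerically on a toy vector field. It is not the group-equivariant graph convolution from the abstract: the function `centroid_offsets`, the helper names, and the sample vertices are all hypothetical, and only rotations about the z-axis plus translations (a subset of SE(3)) are tested. A vector-valued map f on vertices is equivariant when f(Rx + t) = R f(x).

```python
import math

def rotate_z(v, angle):
    """Rotate a 3D vector about the z-axis by the given angle (radians)."""
    c, s = math.cos(angle), math.sin(angle)
    x, y, z = v
    return (c * x - s * y, s * x + c * y, z)

def translate(v, shift):
    """Shift a 3D point by a translation vector."""
    return tuple(a + b for a, b in zip(v, shift))

def centroid_offsets(vertices):
    """Toy vector field (hypothetical stand-in for a network output):
    for each vertex, the vector from the mesh centroid to that vertex."""
    n = len(vertices)
    centroid = tuple(sum(v[i] for v in vertices) / n for i in range(3))
    return [tuple(v[i] - centroid[i] for i in range(3)) for v in vertices]

def max_equivariance_error(vertices, angle, shift):
    """Largest component-wise gap between f(Rx + t) and R f(x)."""
    moved = [translate(rotate_z(v, angle), shift) for v in vertices]
    out_of_moved = centroid_offsets(moved)
    rotated_out = [rotate_z(w, angle) for w in centroid_offsets(vertices)]
    return max(
        abs(a - b)
        for u, w in zip(out_of_moved, rotated_out)
        for a, b in zip(u, w)
    )

# Arbitrary sample vertices; any point set would do.
vertices = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 2.0, 0.0), (0.0, 0.0, 3.0)]
error = max_equivariance_error(vertices, angle=0.7, shift=(5.0, -2.0, 1.0))
print(error < 1e-9)
```

The translation cancels because the centroid moves with the points, and the rotation passes through because subtraction is linear. A model with this property does not need to see every orientation of an artery during training, which is one plausible reading of "efficient use of training data".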
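Sign Language Production (SLP) aims to translate spoken-language expressions into their sign-language counterparts, such as skeleton-based sign poses or videos. Existing SLP models are either autoregressive (AR) or non-autoregressive (NAR). However, AR-SLP models suffer from regression to the mean and from error propagation during decoding. NSLP-G, a NAR-based model, resolves these issues to some extent but introduces other problems: for example, it does not consider target sign lengths and suffers from false decoding initiation. We propose a novel NAR-SLP model based on knowledge distillation (KD) to address these problems. First, we design a length regulator to predict the end of the generated sign pose sequence. We then adopt KD, which distills spatial-linguistic features from a pre-trained pose encoder to alleviate false decoding initiation. Extensive experiments show that the proposed approach significantly outperforms existing SLP models in both Fréchet Gesture Distance and back-translation evaluation.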
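Diagnosing Alzheimer's disease (AD) involves a deliberate diagnostic process because the disease is inherently irreversible and progresses subtly and gradually. These characteristics make identifying AD biomarkers from structural brain imaging (e.g., structural MRI) scans very challenging. Moreover, the biomarkers are highly likely to be entangled with normal aging. We propose a novel deep-learning approach, eXplainable AD Likelihood Map Estimation (XADLiME), for AD progression modeling over 3D sMRIs using clinically-guided prototype learning. Specifically, we establish a set of topologically-aware prototypes on clusters of latent clinical features, uncovering an AD spectrum manifold. We then measure the similarities between the latent clinical features and the well-established prototypes to estimate a "pseudo" likelihood map. Treating this pseudo map as an enriched reference, we employ an estimating network to estimate the AD likelihood map over a 3D sMRI scan. In addition, we promote the explainability of this likelihood map by providing a comprehensible overview from two perspectives: clinical and morphological. During inference, the estimated likelihood map can stand in for unseen sMRI scans to perform downstream tasks effectively while providing thoroughly explainable states.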
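Methods that allow the synthesis of realistic cell shapes could help generate training datasets to improve cell tracking and segmentation in biomedical images. Deep generative models for cell shape synthesis require a lightweight and flexible representation of the cell shape. However, commonly used voxel-based representations are unsuitable for high-resolution shape synthesis, and polygon meshes have limitations when modeling topology changes such as cell growth or mitosis. In this work, we propose to use level sets of signed distance functions (SDFs) to represent cell shapes. We optimize a neural network as an implicit neural representation of the SDF value at any point in a 3D+time domain. The model is conditioned on a latent code, which allows the synthesis of new and unseen shape sequences. We validate our approach quantitatively and qualitatively on C. elegans cells that grow and divide, and on lung cancer cells with growing, complex filopodial protrusions. Our results show that the shape descriptors of synthetic cells resemble those of real cells, and that our model can generate topologically plausible sequences of complex cell shapes in 3D+time.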
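As an illustration of why level sets of an SDF handle topology changes such as division, here is a minimal sketch using a hand-written analytic SDF in place of the neural network described above. The two-sphere "dividing cell", the function names, and the parameters `radius` and `max_offset` are all assumptions made for the example, not part of the described model.

```python
import math

def sphere_sdf(p, center, radius):
    """Signed distance from point p to a sphere: negative inside, positive outside."""
    return math.dist(p, center) - radius

def dividing_cell_sdf(p, t, radius=1.0, max_offset=1.5):
    """Toy implicit shape in a 3D+time domain (hypothetical example).

    Two spheres start at the same centre (t = 0) and drift apart along x
    (t = 1). Taking the pointwise minimum gives a function whose zero level
    set is the surface of the union of the two spheres.
    """
    offset = max_offset * t
    left = sphere_sdf(p, (-offset, 0.0, 0.0), radius)
    right = sphere_sdf(p, (offset, 0.0, 0.0), radius)
    return min(left, right)

def is_inside(p, t):
    """A point belongs to the shape when the implicit function is <= 0."""
    return dividing_cell_sdf(p, t) <= 0.0

midpoint = (0.0, 0.0, 0.0)
print(is_inside(midpoint, t=0.0))  # single blob: the midpoint is inside
print(is_inside(midpoint, t=1.0))  # two separate blobs: the midpoint is outside
```

The shape splits from one component into two without any change to the representation; only the value of the function at each query point changes. A mesh would need explicit re-meshing at the moment of division, which is the limitation the abstract points to.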
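Current keyword spotting systems are typically trained with a large number of predefined keywords. Recognizing keywords in an open-vocabulary setting is essential for personalizing smart-device interaction. Toward this goal, we propose a pure MLP-based neural network built on MLPMixer, an MLP model architecture that effectively replaces the attention mechanism in Vision Transformers. We investigate different ways of adapting the MLPMixer architecture to the query-by-example (QbyE) open-vocabulary keyword spotting task. Comparisons with state-of-the-art RNN and CNN models show that our method achieves better performance in challenging conditions (10 dB and 6 dB environments) on both the publicly available Hey-Snips dataset and a larger-scale in-house dataset with 400 speakers. Our proposed model also has fewer parameters and MACs than the baseline models.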
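Online anomaly detection from a data stream is critical to the safety and security of many applications, but it faces severe challenges from the complex and evolving data streams produced by IoT devices and cloud-based infrastructures. Unfortunately, existing approaches fall short of these challenges: online anomaly detection methods struggle with the burden of handling complexity, while offline deep anomaly detection methods suffer from the evolving data distribution. This paper presents ARCUS, a framework for online deep anomaly detection that can be instantiated with any autoencoder-based deep anomaly detection method. It handles complex and evolving data streams using an adaptive model pooling approach with two novel techniques: concept-driven inference and drift-aware model pool update. The former detects anomalies with the combination of models best suited to the complexity, and the latter dynamically adjusts the model pool to fit the evolving data streams. In comprehensive experiments on ten datasets that are both high-dimensional and concept-drifted, ARCUS improved the anomaly detection accuracy of the streaming variants of state-of-the-art autoencoder-based methods and of state-of-the-art streaming anomaly detection methods by 22% and 37%, respectively.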